Chapter 20
Getting the Hint from Epidemiologic Inference
IN THIS CHAPTER
Choosing potential confounders for your regression model
Using a modeling approach to develop a final model
Adding interactions to the final model
Interpreting the final model for causal inference
In Parts 5 and 6, we describe different types of regression, such as ordinary least-squares regression,
logistic regression, Poisson regression, and survival regression. In each kind of regression we cover,
we describe a situation in which you are performing multivariable regression (sometimes loosely called
multivariate regression), which means you are making a regression model with more than one
independent variable. Those chapters
describe the mechanics of fitting these multivariable models, but they don’t provide much guidance on
which independent variables to consider putting in the multivariable model.
The chapters in Parts 5 and 6 also discuss model-fitting, which means the act of trying to refine your
regression model so that it optimally fits your data. When you have a lot of candidate independent
variables (or candidate covariates), part of model-fitting has to do with deciding which of these
variables actually fit in the model and should stay in, and which ones don’t fit and should be kicked
out. Part of what guides this decision-making process is the mechanics of modeling and model-fitting.
The other main part of what guides these decisions is the hypothesis you are trying to test with your
model, which is the focus of this chapter.
In this chapter, we revisit the concept of confounding from Chapter 7 and explain how to choose
candidate covariates for your regression model. We also discuss modeling approaches and explain
how to add interaction terms to your final model.
Staying Clearheaded about Confounding
Chapter 7 discusses study design and terminology in epidemiology. As a reminder, in epidemiology,
exposure refers to a factor you hypothesize to cause a disease (or outcome). In your regression model,
the outcome is the dependent variable. The exposure will be one of the covariates in your model. But
what other covariates belong in the model? How do you decide on a collection of candidate
independent variables that you would even consider putting in a model with the exposure? The answer
is that you choose them on the basis of their status as potential confounders.
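To make the roles of these variables concrete, here is a minimal Python sketch that records which variable is the outcome, which is the exposure, and which are candidate covariates, and then writes the model out as a formula string in the "outcome ~ term + term" style used by R and by Python formula libraries. The class name ModelSpec and the variable names (hypertension, smoker, age, sex, bmi) are hypothetical and chosen only for illustration. The sketch does not fit a model; it only shows how the pieces are arranged.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelSpec:
    """Roles that variables play in a regression model (all names are hypothetical)."""
    outcome: str    # dependent variable (the disease or outcome)
    exposure: str   # hypothesized cause; always included in the model
    candidate_covariates: List[str] = field(default_factory=list)  # potential confounders

    def formula(self) -> str:
        """Return a formula string of the form: outcome ~ exposure + covariates."""
        terms = [self.exposure] + list(self.candidate_covariates)
        return f"{self.outcome} ~ " + " + ".join(terms)


# Hypothetical example: smoking is the exposure and hypertension is the outcome.
spec = ModelSpec(
    outcome="hypertension",
    exposure="smoker",
    candidate_covariates=["age", "sex", "bmi"],
)
print(spec.formula())  # hypertension ~ smoker + age + sex + bmi
```

Keeping the exposure separate from the candidate covariates reflects the idea that the exposure stays in the model because it is the focus of your hypothesis, whereas each candidate covariate has to earn its place.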
A confounder is a factor that meets these three criteria: